Overview

Dataset statistics

Number of variables: 18
Number of observations: 1452
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 972.1 KiB
Average record size in memory: 685.6 B
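
The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that reading against the figures reported above; the helper names are hypothetical and are not part of pandas-profiling.

```python
KIB = 1024  # bytes per KiB

def average_record_size_bytes(total_size_kib, n_observations):
    """Total in-memory size (KiB) spread evenly over the rows, in bytes."""
    return total_size_kib * KIB / n_observations

def missing_cells_percent(missing_cells, n_variables, n_observations):
    """Share of missing cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_variables * n_observations)

# Figures taken from the dataset statistics above.
avg_bytes = average_record_size_bytes(972.1, 1452)
missing_pct = missing_cells_percent(0, 18, 1452)

print(f"Average record size: {avg_bytes:.1f} B")
print(f"Missing cells: {missing_pct:.1f}%")
```

Rounded to one decimal place, the first value agrees with the 685.6 B shown in the report.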

Variable types

Numeric: 8
Categorical: 6
Boolean: 4

Warnings

df_index is highly correlated with housing and 5 other fields (High correlation)
housing is highly correlated with df_index and 2 other fields (High correlation)
poutcome is highly correlated with df_index and 2 other fields (High correlation)
duration is highly correlated with subscribed (High correlation)
job is highly correlated with age and 1 other fields (High correlation)
age is highly correlated with job and 1 other fields (High correlation)
day is highly correlated with df_index and 3 other fields (High correlation)
subscribed is highly correlated with df_index and 5 other fields (High correlation)
month is highly correlated with df_index and 4 other fields (High correlation)
marital is highly correlated with age (High correlation)
education is highly correlated with job (High correlation)
pdays is highly correlated with df_index and 1 other fields (High correlation)
subscribed is highly correlated with month and 2 other fields (High correlation)
month is highly correlated with subscribed and 1 other fields (High correlation)
housing is highly correlated with subscribed (High correlation)
education is highly correlated with job (High correlation)
poutcome is highly correlated with subscribed and 1 other fields (High correlation)
job is highly correlated with education (High correlation)
df_index has unique values (Unique)
balance has 67 (4.6%) zeros (Zeros)

Reproduction

Analysis started: 2022-03-27 04:44:28.444860
Analysis finished: 2022-03-27 04:44:31.844865
Duration: 3.4 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 1452
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 905.9566116
Minimum: 0
Maximum: 1965
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 74.55
Q1: 372.75
median: 760.5
Q3: 1454.25
95-th percentile: 1887.45
Maximum: 1965
Range: 1965
Interquartile range (IQR): 1081.5

Descriptive statistics

Standard deviation: 599.5776082
Coefficient of variation (CV): 0.661817134
Kurtosis: -1.305319949
Mean: 905.9566116
Median Absolute Deviation (MAD): 513
Skewness: 0.2103893999
Sum: 1315449
Variance: 359493.3082
Monotonicity: Strictly increasing
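
Several of these figures are derived from one another. The sketch below recomputes the mean, coefficient of variation, variance, interquartile range and range from the other values reported for df_index, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). It is a consistency check on the printed numbers, not the report's own code.

```python
# Values copied from the df_index summary in this report.
n_observations = 1452
total = 1315449
mean = 905.9566116
std = 599.5776082
q1, q3 = 372.75, 1454.25
minimum, maximum = 0, 1965

derived = {
    "mean": total / n_observations,   # Sum / number of observations
    "cv": std / mean,                 # coefficient of variation
    "variance": std ** 2,             # square of the standard deviation
    "iqr": q3 - q1,                   # interquartile range
    "range": maximum - minimum,
}

for name, value in derived.items():
    print(f"{name}: {value:.6f}")
```

Each derived value matches the corresponding reported figure to the precision shown in the report.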
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | 0.1%
1279 | 1 | 0.1%
1277 | 1 | 0.1%
1276 | 1 | 0.1%
1275 | 1 | 0.1%
1274 | 1 | 0.1%
1273 | 1 | 0.1%
1272 | 1 | 0.1%
1271 | 1 | 0.1%
1269 | 1 | 0.1%
Other values (1442) | 1442 | 99.3%

Value | Count | Frequency (%)
0 | 1 | 0.1%
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%

Value | Count | Frequency (%)
1965 | 1 | 0.1%
1964 | 1 | 0.1%
1963 | 1 | 0.1%
1962 | 1 | 0.1%
1961 | 1 | 0.1%
1960 | 1 | 0.1%
1959 | 1 | 0.1%
1958 | 1 | 0.1%
1957 | 1 | 0.1%
1956 | 1 | 0.1%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 64
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.20454545
Minimum: 19
Maximum: 86
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 19
5-th percentile: 27
Q1: 33
median: 39
Q3: 51
95-th percentile: 64
Maximum: 86
Range: 67
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 12.33408846
Coefficient of variation (CV): 0.2922454993
Kurtosis: 0.2473494214
Mean: 42.20454545
Median Absolute Deviation (MAD): 8
Skewness: 0.8168498292
Sum: 61281
Variance: 152.1297381
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
33 | 72 | 5.0%
35 | 71 | 4.9%
31 | 66 | 4.5%
37 | 64 | 4.4%
34 | 62 | 4.3%
32 | 54 | 3.7%
36 | 54 | 3.7%
38 | 50 | 3.4%
30 | 45 | 3.1%
29 | 42 | 2.9%
Other values (54) | 872 | 60.1%

Value | Count | Frequency (%)
19 | 2 | 0.1%
21 | 2 | 0.1%
22 | 11 | 0.8%
23 | 8 | 0.6%
24 | 10 | 0.7%
25 | 10 | 0.7%
26 | 23 | 1.6%
27 | 28 | 1.9%
28 | 38 | 2.6%
29 | 42 | 2.9%

Value | Count | Frequency (%)
86 | 1 | 0.1%
84 | 2 | 0.1%
82 | 3 | 0.2%
81 | 1 | 0.1%
80 | 6 | 0.4%
79 | 1 | 0.1%
78 | 1 | 0.1%
77 | 6 | 0.4%
75 | 6 | 0.4%
74 | 5 | 0.3%

job
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 11
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 94.1 KiB

management: 351
technician: 258
blue-collar: 212
admin.: 183
retired: 118
Other values (6): 330

Length

Max length: 13
Median length: 10
Mean length: 9.257575758
Min length: 6

Characters and Unicode

Total characters: 13442
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: technician
2nd row: technician
3rd row: retired
4th row: blue-collar
5th row: retired

Common Values

ValueCountFrequency (%)
management351
24.2%
technician258
17.8%
blue-collar212
14.6%
admin.183
12.6%
retired118
 
8.1%
services117
 
8.1%
unemployed54
 
3.7%
student51
 
3.5%
self-employed48
 
3.3%
entrepreneur33
 
2.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
management351
24.2%
technician258
17.8%
blue-collar212
14.6%
admin183
12.6%
retired118
 
8.1%
services117
 
8.1%
unemployed54
 
3.7%
student51
 
3.5%
self-employed48
 
3.3%
entrepreneur33
 
2.3%

Most occurring characters

ValueCountFrequency (%)
e2104
15.7%
n1572
11.7%
a1382
10.3%
m1014
 
7.5%
i961
 
7.1%
t862
 
6.4%
c845
 
6.3%
l786
 
5.8%
r664
 
4.9%
d481
 
3.6%
Other values (12)2771
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter12999
96.7%
Dash Punctuation260
 
1.9%
Other Punctuation183
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2104
16.2%
n1572
12.1%
a1382
10.6%
m1014
7.8%
i961
7.4%
t862
 
6.6%
c845
 
6.5%
l786
 
6.0%
r664
 
5.1%
d481
 
3.7%
Other values (10)2328
17.9%
Dash Punctuation
ValueCountFrequency (%)
-260
100.0%
Other Punctuation
ValueCountFrequency (%)
.183
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin12999
96.7%
Common443
 
3.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2104
16.2%
n1572
12.1%
a1382
10.6%
m1014
7.8%
i961
7.4%
t862
 
6.6%
c845
 
6.5%
l786
 
6.0%
r664
 
5.1%
d481
 
3.7%
Other values (10)2328
17.9%
Common
ValueCountFrequency (%)
-260
58.7%
.183
41.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII13442
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2104
15.7%
n1572
11.7%
a1382
10.3%
m1014
 
7.5%
i961
 
7.1%
t862
 
6.4%
c845
 
6.3%
l786
 
5.8%
r664
 
4.9%
d481
 
3.6%
Other values (12)2771
20.6%

marital
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 90.6 KiB

married: 837
single: 454
divorced: 161

Length

Max length: 8
Median length: 7
Mean length: 6.798209366
Min length: 6

Characters and Unicode

Total characters: 9871
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: single
2nd row: divorced
3rd row: married
4th row: married
5th row: married

Common Values

ValueCountFrequency (%)
married837
57.6%
single454
31.3%
divorced161
 
11.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
married837
57.6%
single454
31.3%
divorced161
 
11.1%

Most occurring characters

ValueCountFrequency (%)
r1835
18.6%
i1452
14.7%
e1452
14.7%
d1159
11.7%
m837
8.5%
a837
8.5%
s454
 
4.6%
n454
 
4.6%
g454
 
4.6%
l454
 
4.6%
Other values (3)483
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter9871
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r1835
18.6%
i1452
14.7%
e1452
14.7%
d1159
11.7%
m837
8.5%
a837
8.5%
s454
 
4.6%
n454
 
4.6%
g454
 
4.6%
l454
 
4.6%
Other values (3)483
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
Latin9871
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r1835
18.6%
i1452
14.7%
e1452
14.7%
d1159
11.7%
m837
8.5%
a837
8.5%
s454
 
4.6%
n454
 
4.6%
g454
 
4.6%
l454
 
4.6%
Other values (3)483
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII9871
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r1835
18.6%
i1452
14.7%
e1452
14.7%
d1159
11.7%
m837
8.5%
a837
8.5%
s454
 
4.6%
n454
 
4.6%
g454
 
4.6%
l454
 
4.6%
Other values (3)483
 
4.9%

education
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 92.9 KiB

secondary: 770
tertiary: 526
primary: 156

Length

Max length: 9
Median length: 9
Mean length: 8.422865014
Min length: 7

Characters and Unicode

Total characters: 12230
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: tertiary
2nd row: secondary
3rd row: secondary
4th row: secondary
5th row: secondary

Common Values

ValueCountFrequency (%)
secondary770
53.0%
tertiary526
36.2%
primary156
 
10.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
secondary770
53.0%
tertiary526
36.2%
primary156
 
10.7%

Most occurring characters

ValueCountFrequency (%)
r2134
17.4%
a1452
11.9%
y1452
11.9%
e1296
10.6%
t1052
8.6%
s770
 
6.3%
c770
 
6.3%
o770
 
6.3%
n770
 
6.3%
d770
 
6.3%
Other values (3)994
8.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter12230
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r2134
17.4%
a1452
11.9%
y1452
11.9%
e1296
10.6%
t1052
8.6%
s770
 
6.3%
c770
 
6.3%
o770
 
6.3%
n770
 
6.3%
d770
 
6.3%
Other values (3)994
8.1%

Most occurring scripts

ValueCountFrequency (%)
Latin12230
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r2134
17.4%
a1452
11.9%
y1452
11.9%
e1296
10.6%
t1052
8.6%
s770
 
6.3%
c770
 
6.3%
o770
 
6.3%
n770
 
6.3%
d770
 
6.3%
Other values (3)994
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII12230
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r2134
17.4%
a1452
11.9%
y1452
11.9%
e1296
10.6%
t1052
8.6%
s770
 
6.3%
c770
 
6.3%
o770
 
6.3%
n770
 
6.3%
d770
 
6.3%
Other values (3)994
8.1%

default
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count | Frequency (%)
False | 1442 | 99.3%
True | 10 | 0.7%

balance
Real number (ℝ)

ZEROS

Distinct: 1014
Distinct (%): 69.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1462.069559
Minimum: -980
Maximum: 81204
Zeros: 67
Zeros (%): 4.6%
Negative: 69
Negative (%): 4.8%
Memory size: 11.5 KiB

Quantile statistics

Minimum: -980
5-th percentile: 0
Q1: 208.75
median: 558.5
Q3: 1659.5
95-th percentile: 5129.55
Maximum: 81204
Range: 82184
Interquartile range (IQR): 1450.75

Descriptive statistics

Standard deviation: 3346.727232
Coefficient of variation (CV): 2.289034205
Kurtosis: 237.1405251
Mean: 1462.069559
Median Absolute Deviation (MAD): 489.5
Skewness: 11.91488485
Sum: 2122925
Variance: 11200583.16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 67 | 4.6%
272 | 8 | 0.6%
1 | 5 | 0.3%
393 | 5 | 0.3%
303 | 5 | 0.3%
392 | 4 | 0.3%
488 | 4 | 0.3%
655 | 4 | 0.3%
228 | 4 | 0.3%
105 | 4 | 0.3%
Other values (1004) | 1342 | 92.4%

Value | Count | Frequency (%)
-980 | 1 | 0.1%
-735 | 1 | 0.1%
-676 | 1 | 0.1%
-557 | 1 | 0.1%
-535 | 1 | 0.1%
-528 | 1 | 0.1%
-498 | 1 | 0.1%
-464 | 1 | 0.1%
-435 | 1 | 0.1%
-421 | 1 | 0.1%

Value | Count | Frequency (%)
81204 | 1 | 0.1%
29340 | 1 | 0.1%
29080 | 1 | 0.1%
27696 | 2 | 0.1%
26306 | 1 | 0.1%
20727 | 1 | 0.1%
17946 | 1 | 0.1%
16992 | 1 | 0.1%
16957 | 1 | 0.1%
15341 | 1 | 0.1%

housing
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count | Frequency (%)
True | 733 | 50.5%
False | 719 | 49.5%

loan
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count | Frequency (%)
False | 1238 | 85.3%
True | 214 | 14.7%

contact
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 92.4 KiB

cellular: 1344
telephone: 108

Length

Max length: 9
Median length: 8
Mean length: 8.074380165
Min length: 8

Characters and Unicode

Total characters: 11724
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: cellular
2nd row: cellular
3rd row: cellular
4th row: cellular
5th row: cellular

Common Values

ValueCountFrequency (%)
cellular1344
92.6%
telephone108
 
7.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
cellular1344
92.6%
telephone108
 
7.4%

Most occurring characters

ValueCountFrequency (%)
l4140
35.3%
e1668
14.2%
c1344
 
11.5%
u1344
 
11.5%
a1344
 
11.5%
r1344
 
11.5%
t108
 
0.9%
p108
 
0.9%
h108
 
0.9%
o108
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11724
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l4140
35.3%
e1668
14.2%
c1344
 
11.5%
u1344
 
11.5%
a1344
 
11.5%
r1344
 
11.5%
t108
 
0.9%
p108
 
0.9%
h108
 
0.9%
o108
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin11724
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l4140
35.3%
e1668
14.2%
c1344
 
11.5%
u1344
 
11.5%
a1344
 
11.5%
r1344
 
11.5%
t108
 
0.9%
p108
 
0.9%
h108
 
0.9%
o108
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII11724
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l4140
35.3%
e1668
14.2%
c1344
 
11.5%
u1344
 
11.5%
a1344
 
11.5%
r1344
 
11.5%
t108
 
0.9%
p108
 
0.9%
h108
 
0.9%
o108
 
0.9%

day
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 31
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.1184573
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 13
Q3: 25
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 10.21740334
Coefficient of variation (CV): 0.7236912025
Kurtosis: -1.437089309
Mean: 14.1184573
Median Absolute Deviation (MAD): 9
Skewness: 0.304773524
Sum: 20500
Variance: 104.395331
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
29 | 164 | 11.3%
2 | 141 | 9.7%
4 | 138 | 9.5%
3 | 88 | 6.1%
5 | 81 | 5.6%
28 | 74 | 5.1%
6 | 70 | 4.8%
30 | 58 | 4.0%
13 | 52 | 3.6%
17 | 50 | 3.4%
Other values (21) | 536 | 36.9%

Value | Count | Frequency (%)
1 | 34 | 2.3%
2 | 141 | 9.7%
3 | 88 | 6.1%
4 | 138 | 9.5%
5 | 81 | 5.6%
6 | 70 | 4.8%
7 | 12 | 0.8%
8 | 20 | 1.4%
9 | 49 | 3.4%
10 | 23 | 1.6%

Value | Count | Frequency (%)
31 | 7 | 0.5%
30 | 58 | 4.0%
29 | 164 | 11.3%
28 | 74 | 5.1%
27 | 21 | 1.4%
26 | 25 | 1.7%
25 | 20 | 1.4%
24 | 16 | 1.1%
23 | 21 | 1.4%
22 | 25 | 1.7%

month
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 85.2 KiB

feb: 388
jan: 228
apr: 144
may: 123
jul: 105
Other values (7): 464

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 4356
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: apr
2nd row: apr
3rd row: apr
4th row: apr
5th row: apr

Common Values

ValueCountFrequency (%)
feb388
26.7%
jan228
15.7%
apr144
 
9.9%
may123
 
8.5%
jul105
 
7.2%
aug98
 
6.7%
sep93
 
6.4%
jun67
 
4.6%
oct65
 
4.5%
nov53
 
3.7%
Other values (2)88
 
6.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
feb388
26.7%
jan228
15.7%
apr144
 
9.9%
may123
 
8.5%
jul105
 
7.2%
aug98
 
6.7%
sep93
 
6.4%
jun67
 
4.6%
oct65
 
4.5%
nov53
 
3.7%
Other values (2)88
 
6.1%

Most occurring characters

ValueCountFrequency (%)
a642
14.7%
e520
11.9%
j400
9.2%
f388
8.9%
b388
8.9%
n348
8.0%
u270
 
6.2%
p237
 
5.4%
r193
 
4.4%
m172
 
3.9%
Other values (9)798
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4356
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a642
14.7%
e520
11.9%
j400
9.2%
f388
8.9%
b388
8.9%
n348
8.0%
u270
 
6.2%
p237
 
5.4%
r193
 
4.4%
m172
 
3.9%
Other values (9)798
18.3%

Most occurring scripts

ValueCountFrequency (%)
Latin4356
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a642
14.7%
e520
11.9%
j400
9.2%
f388
8.9%
b388
8.9%
n348
8.0%
u270
 
6.2%
p237
 
5.4%
r193
 
4.4%
m172
 
3.9%
Other values (9)798
18.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII4356
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a642
14.7%
e520
11.9%
j400
9.2%
f388
8.9%
b388
8.9%
n348
8.0%
u270
 
6.2%
p237
 
5.4%
r193
 
4.4%
m172
 
3.9%
Other values (9)798
18.3%

duration
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 570
Distinct (%): 39.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 276.6336088
Minimum: 7
Maximum: 1823
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 7
5-th percentile: 58.55
Q1: 135
median: 224.5
Q3: 356
95-th percentile: 674.9
Maximum: 1823
Range: 1816
Interquartile range (IQR): 221

Descriptive statistics

Standard deviation: 208.1655544
Coefficient of variation (CV): 0.7524955313
Kurtosis: 6.067811471
Mean: 276.6336088
Median Absolute Deviation (MAD): 103.5
Skewness: 1.98096399
Sum: 401672
Variance: 43332.89806
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
90 | 10 | 0.7%
200 | 10 | 0.7%
150 | 10 | 0.7%
187 | 9 | 0.6%
121 | 8 | 0.6%
236 | 8 | 0.6%
164 | 8 | 0.6%
124 | 8 | 0.6%
75 | 8 | 0.6%
89 | 8 | 0.6%
Other values (560) | 1365 | 94.0%

Value | Count | Frequency (%)
7 | 2 | 0.1%
8 | 2 | 0.1%
9 | 1 | 0.1%
12 | 1 | 0.1%
13 | 1 | 0.1%
15 | 1 | 0.1%
16 | 2 | 0.1%
17 | 1 | 0.1%
20 | 1 | 0.1%
21 | 1 | 0.1%

Value | Count | Frequency (%)
1823 | 1 | 0.1%
1472 | 1 | 0.1%
1405 | 1 | 0.1%
1207 | 1 | 0.1%
1205 | 1 | 0.1%
1193 | 1 | 0.1%
1178 | 1 | 0.1%
1176 | 1 | 0.1%
1160 | 1 | 0.1%
1156 | 1 | 0.1%

campaign
Real number (ℝ≥0)

Distinct: 11
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.943526171
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 5
Maximum: 11
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.407709683
Coefficient of variation (CV): 0.7243070374
Kurtosis: 5.969865496
Mean: 1.943526171
Median Absolute Deviation (MAD): 0
Skewness: 2.196232061
Sum: 2822
Variance: 1.981646551
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
1 | 755 | 52.0%
2 | 379 | 26.1%
3 | 154 | 10.6%
4 | 72 | 5.0%
5 | 39 | 2.7%
6 | 28 | 1.9%
7 | 14 | 1.0%
9 | 5 | 0.3%
8 | 4 | 0.3%
10 | 1 | 0.1%

Value | Count | Frequency (%)
1 | 755 | 52.0%
2 | 379 | 26.1%
3 | 154 | 10.6%
4 | 72 | 5.0%
5 | 39 | 2.7%
6 | 28 | 1.9%
7 | 14 | 1.0%
8 | 4 | 0.3%
9 | 5 | 0.3%
10 | 1 | 0.1%

Value | Count | Frequency (%)
11 | 1 | 0.1%
10 | 1 | 0.1%
9 | 5 | 0.3%
8 | 4 | 0.3%
7 | 14 | 1.0%
6 | 28 | 1.9%
5 | 39 | 2.7%
4 | 72 | 5.0%
3 | 154 | 10.6%
2 | 379 | 26.1%

pdays
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 350
Distinct (%): 24.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 217.3663912
Minimum: 1
Maximum: 854
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 88
Q1: 177
median: 199
Q3: 261
95-th percentile: 407.9
Maximum: 854
Range: 853
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 106.6536259
Coefficient of variation (CV): 0.4906629094
Kurtosis: 7.191928958
Mean: 217.3663912
Median Absolute Deviation (MAD): 54
Skewness: 1.819341652
Sum: 315616
Variance: 11374.99592
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
182 | 60 | 4.1%
181 | 60 | 4.1%
92 | 49 | 3.4%
183 | 34 | 2.3%
91 | 34 | 2.3%
184 | 29 | 2.0%
272 | 20 | 1.4%
196 | 19 | 1.3%
252 | 19 | 1.3%
245 | 18 | 1.2%
Other values (340) | 1110 | 76.4%

Value | Count | Frequency (%)
1 | 7 | 0.5%
2 | 2 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
14 | 1 | 0.1%
19 | 1 | 0.1%
21 | 1 | 0.1%
24 | 1 | 0.1%
29 | 1 | 0.1%

Value | Count | Frequency (%)
854 | 1 | 0.1%
842 | 1 | 0.1%
828 | 1 | 0.1%
805 | 1 | 0.1%
804 | 1 | 0.1%
792 | 1 | 0.1%
784 | 1 | 0.1%
776 | 1 | 0.1%
769 | 1 | 0.1%
761 | 1 | 0.1%

previous
Real number (ℝ≥0)

Distinct: 25
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.335399449
Minimum: 1
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 9
Maximum: 55
Range: 54
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.580294917
Coefficient of variation (CV): 1.07342313
Kurtosis: 61.87745737
Mean: 3.335399449
Median Absolute Deviation (MAD): 1
Skewness: 5.846377179
Sum: 4843
Variance: 12.81851169
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Value | Count | Frequency (%)
1 | 406 | 28.0%
2 | 379 | 26.1%
3 | 230 | 15.8%
4 | 144 | 9.9%
5 | 80 | 5.5%
6 | 55 | 3.8%
7 | 41 | 2.8%
8 | 31 | 2.1%
9 | 20 | 1.4%
10 | 19 | 1.3%
Other values (15) | 47 | 3.2%

Value | Count | Frequency (%)
1 | 406 | 28.0%
2 | 379 | 26.1%
3 | 230 | 15.8%
4 | 144 | 9.9%
5 | 80 | 5.5%
6 | 55 | 3.8%
7 | 41 | 2.8%
8 | 31 | 2.1%
9 | 20 | 1.4%
10 | 19 | 1.3%

Value | Count | Frequency (%)
55 | 1 | 0.1%
51 | 1 | 0.1%
38 | 1 | 0.1%
29 | 1 | 0.1%
27 | 1 | 0.1%
23 | 2 | 0.1%
20 | 1 | 0.1%
19 | 1 | 0.1%
17 | 2 | 0.1%
16 | 2 | 0.1%

poutcome
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 90.6 KiB

failure: 905
success: 411
other: 136

Length

Max length: 7
Median length: 7
Mean length: 6.812672176
Min length: 5

Characters and Unicode

Total characters: 9892
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: failure
2nd row: failure
3rd row: failure
4th row: failure
5th row: failure

Common Values

ValueCountFrequency (%)
failure905
62.3%
success411
28.3%
other136
 
9.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
failure905
62.3%
success411
28.3%
other136
 
9.4%

Most occurring characters

ValueCountFrequency (%)
e1452
14.7%
u1316
13.3%
s1233
12.5%
r1041
10.5%
f905
9.1%
a905
9.1%
i905
9.1%
l905
9.1%
c822
8.3%
o136
 
1.4%
Other values (2)272
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter9892
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1452
14.7%
u1316
13.3%
s1233
12.5%
r1041
10.5%
f905
9.1%
a905
9.1%
i905
9.1%
l905
9.1%
c822
8.3%
o136
 
1.4%
Other values (2)272
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
Latin9892
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1452
14.7%
u1316
13.3%
s1233
12.5%
r1041
10.5%
f905
9.1%
a905
9.1%
i905
9.1%
l905
9.1%
c822
8.3%
o136
 
1.4%
Other values (2)272
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII9892
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e1452
14.7%
u1316
13.3%
s1233
12.5%
r1041
10.5%
f905
9.1%
a905
9.1%
i905
9.1%
l905
9.1%
c822
8.3%
o136
 
1.4%
Other values (2)272
 
2.7%

subscribed
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count | Frequency (%)
False | 795 | 54.8%
True | 657 | 45.2%

Interactions

[Figures: 64 pairwise interaction plots; images not reproduced in this text version.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
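
The calculation described above can be sketched in plain Python as follows. The function name is hypothetical, and this is an illustration of the formula rather than the code the report itself runs. The normalising factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations; the shared 1/(n-1)
    # factors cancel between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (dev_x * dev_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r_original = pearson_r(x, y)
# Shifting and rescaling y (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * b + 7.0 for b in y])
```

The last two lines illustrate the invariance under changes in location and scale mentioned above.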

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
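
A minimal sketch of this rank-based calculation is shown below. It assumes that tied values receive the average of the ranks they span, which is a common convention but is not stated in the report; all function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Plain Pearson correlation, used here on the rank variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A nonlinear but strictly increasing relationship: rho is 1, r is below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

The cubic example shows why ρ catches monotonic relationships that r understates.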

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
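
The pair-counting definition above can be sketched as follows. This implements exactly the formula as stated (often called tau-a); statistical libraries commonly apply a tie correction instead, so results can differ when the data contain ties. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant (the middle swap).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

This direct version compares every pair, so it runs in quadratic time; that is fine for a dataset of this size but slow for very large ones.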

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
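
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table using the correction commonly attributed to Bergsma (2013): the chi-squared based φ² and the table dimensions are each adjusted by a term depending on the sample size. The function names are hypothetical, and the code is not guaranteed to match the report's implementation in every detail.

```python
import math
from collections import Counter

def contingency_table(xs, ys):
    """Cross-tabulate two equally long sequences of category labels."""
    rows = sorted(set(xs))
    cols = sorted(set(ys))
    counts = Counter(zip(xs, ys))
    return [[counts[(r, c)] for c in cols] for r in rows]

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a table of observed counts."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    if n < 2 or r < 2 or k < 2:
        return 0.0
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Pearson chi-squared statistic without continuity correction.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias correction of phi squared and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

independent = cramers_v_corrected([[10, 20], [20, 40]])   # proportional rows
perfect = cramers_v_corrected([[50, 0], [0, 50]])         # one-to-one mapping
```

The two examples cover the end points of the scale described above: independent variables give 0 and a perfect one-to-one association gives 1.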

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

df_index | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | subscribed
0032.0techniciansingletertiaryno392yesnocellular1apr95721312failureno
1139.0techniciandivorcedsecondaryno688yesyescellular1apr23321331failureno
2259.0retiredmarriedsecondaryno1035yesyescellular1apr12622391failureno
3347.0blue-collarmarriedsecondaryno398yesyescellular1apr27412382failureno
4454.0retiredmarriedsecondaryno1004yesnocellular1apr47913071failureno
5546.0self-employeddivorcedtertiaryno926yesnocellular1apr46311333failureno
6634.0blue-collarmarriedsecondaryno1924yesyescellular1apr16122531failureno
7745.0servicesdivorcedsecondaryno396yesyescellular1apr25143294failureno
8858.0managementdivorcedtertiaryno315yesnocellular1apr12121352failureno
9949.0managementdivorcedtertiaryno20727nonocellular1apr28531322failureno

Last rows

df_index | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | subscribed
1442195633.0managementsingletertiaryno224nonocellular23sep4092924successyes
1443195759.0managementmarriedtertiaryno5397nonocellular23sep2441923successyes
1444195830.0self-employedsingletertiaryno655nonocellular23sep27241841successyes
1445195940.0managementsingletertiaryno3840yesnocellular24sep23224092successyes
1446196049.0managementmarriedtertiaryno1167yesyescellular24sep24919114successyes
1447196155.0servicesdivorcedsecondaryno0nonocellular27sep26261934successyes
1448196238.0servicesmarriedsecondaryno2678nonocellular28sep28221871successyes
1449196348.0managementsingletertiaryno334yesnocellular28sep60029212successyes
1450196461.0retiredmarriedsecondaryno11nonocellular29sep2321923successyes
1451196528.0managementsingletertiaryno390nonocellular29sep84512324successyes